ASYMPTOTIC ENTROPY OF RANDOM WALKS ON REGULAR 
LANGUAGES OVER A FINITE ALPHABET 
Abstract. We prove existence of asymptotic entropy of random walks on regular languages
over a finite alphabet and we give formulas for it. Furthermore, we show that the entropy
varies real-analytically in terms of the probability measures of constant support, which
describe the random walk. This setting applies, in particular, to random walks on
virtually free groups.

1. Introduction

Let $A$ be a finite alphabet and let $A^*$ be the set of all finite words over the alphabet $A$,
where $o$ denotes the empty word. Consider a transient Markov chain $(X_n)_{n\in\mathbb{N}_0}$ on $A^*$
with $X_0 = o$ such that the transition probabilities depend only on the last $K\in\mathbb{N}$ letters of
the current word, the word length changes by at most $K$ letters in each step, and in each step
only the last $K$ letters of the current word may be modified. Denote by $\pi_n$ the distribution
of $X_n$. We are interested in whether the sequence $\frac{1}{n}\mathbb{E}[-\log\pi_n(X_n)]$ converges and,
if so, in describing the limit. If the limit exists, it is called the asymptotic entropy, which was
introduced by Avez [1]. The aim of this paper is to prove existence of the asymptotic entropy,
to describe it as the rate of escape w.r.t. the Greenian distance, and to prove its real-analytic
behaviour when varying the transition probabilities of constant support.

We outline some background on this topic. It is well-known by Kingman's subadditive
ergodic theorem (see Kingman [11]) that the entropy exists for random walks on groups if
$\mathbb{E}[-\log\pi_1(X_1)] < \infty$. In contrast, existence of the entropy on general structures
is not known a priori. In our setting we are not able to apply the subadditive ergodic theorem,
since we have neither subadditivity nor a global composition law of words if we restrict
the random walk to a proper subset of $A^*$. This forces us to use other techniques such as
generating functions. These generating functions are power series with probabilities as
coefficients, which describe the characteristic behaviour of the underlying random walks.
The technique of our proof of existence of the entropy was motivated by Benjamini
and Peres [2], where it is shown that for random walks on groups the entropy equals the
rate of escape w.r.t. the Greenian distance; compare also with Blachère, Haïssinsky and
Mathieu [3]. In particular, we will also show that the asymptotic entropy $h$ is the rate of
escape w.r.t. a distance function in terms of Green functions, which in turn yields that $h$
is also the rate of escape w.r.t. the Greenian distance. Moreover, we prove convergence in
probability and convergence in $L_1$ of the sequence $-\frac{1}{n}\log\pi_n(X_n)$ to $h$, and we also
show that $h$ can be computed along almost every sample path as the limit inferior of the
aforementioned sequence. The question of almost sure convergence of $-\frac{1}{n}\log\pi_n(X_n)$ to
some constant $h$, however, remains open. Similar results concerning existence and formulas for
the entropy are proved in Gilch and Müller [8] for random walks on directed covers of
graphs and in Gilch [7] for random walks on free products of graphs. Furthermore, we give
formulas for the entropy which allow numerical computations and also exact calculations
in some special cases.

Kaimanovich and Erschler asked whether drift and entropy of random walks vary continuously
(or even analytically) when varying the probabilities of the random walk while
keeping the support of single step transitions constant. In this article we also show that $h$
is real-analytic in terms of the parameters describing the random walk on $A^*$. This fact
applies, in particular, to the case of bounded range random walks on virtually free groups. At
this point let us mention that several papers concerning continuity and analyticity of the
drift and entropy have been published recently: e.g., see Ledrappier [13], [14], Haïssinsky,
Mathieu and Müller [9], Gilch [7]. The recent article of Gilch and Ledrappier [5] collects
several results about analyticity of drift and entropy of random walks on groups.

The reasoning of our proofs follows a similar argumentation as in [8] and [7]: we will
show that the entropy equals the rate of escape w.r.t. some special length function, and we
deduce the proposed properties analogously. The plan of the paper is as follows: in Sections
2 and 3 we define the random walk on the regular language and the associated generating
functions. Section 4 explains the structure of cones in the present context. In Sections 5
and 6 we prove existence of the asymptotic entropy, while in Section 7 we give explicit
formulas for it. Section 8 shows real-analyticity of the entropy.

2. Notation 

Let $A$ be a finite alphabet and denote by $o$ the empty word. A random walk on a regular
language is a Markov chain on a subset $\mathcal{L}\subseteq A^* := \bigcup_{n\geq 1} A^n\cup\{o\}$ of all finite words over
the alphabet $A$, whose transition probabilities obey the following rules:

(i) Only the last two letters of the current word may be modified, 
(ii) Only one letter may be adjoined or deleted at one instant of time, 
(iii) Adjunction and deletion may only be done at the end of the current word, 
(iv) Probabilities of modification, adjunction or deletion depend only on the last two 
letters of the current word. 

Compare with Lalley [12] and Gilch [6]. The assumption that transition probabilities
depend only on the last two letters of the current word may be weakened to dependence on
the last $K\geq 2$ letters by blocking words of length at most $K$ into new letters (compare
with [12, Section 3.3]). In general, a regular language is a subset of $A^*$ whose words are
accepted by a finite-state automaton. It is necessary that each modification of a word of
the regular language in one single step creates a new word of the regular language. The
results below, however, are general enough that w.l.o.g., for ease and better readability,
we may assume that the regular language $\mathcal{L}$ consists of the whole set $A^*$. We will use the
notation $w\in\mathcal{L}$ or $w\in A^*$, respectively, to emphasize at some points that we explicitly mean
a word of the language or just a word over the alphabet. Let us note that random walks on
virtually free groups constitute a special case of our setting, and our results directly apply.

We introduce some notation. For a word $w\in\mathcal{L}$ and $k\in\mathbb{N}$, $w[k]$ denotes the $k$-th letter
of $w$, and $[w]$ denotes the last two letters of $w$. The random walk on $\mathcal{L}$ is described by
the sequence of random variables $(X_n)_{n\in\mathbb{N}_0}$. Initially, we have $X_0 := o$. If we want to start
the random walk at $w\in\mathcal{L}$ instead of $o$, we write for short $\mathbb{P}_w[\,\cdot\,] := \mathbb{P}[\,\cdot\mid X_0 = w]$. For
two words $w_1,w_2\in A^*$, we write $w_1w_2$ for the concatenated word. We use the following
abbreviations for the transition probabilities: for $w\in\mathcal{L}$, $a_1,a_2,b_1\in A$, $b_2,c\in A\cup\{o\}$,
$n\in\mathbb{N}_0$, we write

$$\mathbb{P}[X_{n+1} = wa_2b_2c \mid X_n = wa_1b_1] = p(a_1b_1,a_2b_2c),$$

$$\mathbb{P}[X_{n+1} = b_2c \mid X_n = a_1] = p(a_1,b_2c),$$

$$\mathbb{P}[X_{n+1} = b_2 \mid X_n = o] = p(o,b_2).$$

We assume that $\mathbb{P}_u[X_1 = v] > 0$ implies $\mathbb{P}_v[X_1 = u] > 0$ for all $u,v\in\mathcal{L}$. We call this
property weak symmetry. For $w_1,w_2\in\mathcal{L}$, the $n$-step transition probabilities are denoted
by $p^{(n)}(w_1,w_2) := \mathbb{P}_{w_1}[X_n = w_2]$. The natural word length of any $w\in\mathcal{L}$ is denoted by $|w|$.
Malyshev [15] proved that the rate of escape w.r.t. the natural word length exists under
some natural assumptions, that is, there is a non-negative constant $\ell$ such that

$$\lim_{n\to\infty}\frac{|X_n|}{n} = \ell \quad\text{almost surely.}$$

Here, $\ell$ is called the rate of escape. Furthermore, it follows from [15] that $\ell$ is strictly positive
if and only if $(X_n)_{n\in\mathbb{N}_0}$ is transient. In [6] there are explicit formulas for the rate of escape
w.r.t. more general length functions.

Another characteristic number of random walks is the asymptotic entropy. Denote by $\pi_n$
the distribution of $X_n$. If the limit

$$h = \lim_{n\to\infty} -\frac{1}{n}\,\mathbb{E}\log\pi_n(X_n)$$

exists, then the non-negative constant $h$ is called the asymptotic entropy. Since we only have
a partial composition law for concatenation of two words (if $\mathcal{L}\subsetneq A^*$) and since we have
neither subadditivity nor transitivity of the random walk, we cannot apply Kingman's subadditive
ergodic theorem, as in the case of random walks on groups, to show existence of $h$. It is easy to see
that the entropy equals zero if the random walk is recurrent (see Corollary 7.3). Therefore,
we assume from now on transience of $(X_n)_{n\in\mathbb{N}_0}$.
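
As a purely illustrative aside (not part of the original argument), the quantity $-\frac{1}{n}\mathbb{E}\log\pi_n(X_n)$ can be computed exactly for small $n$ in a toy example. The following Python sketch assumes a hypothetical walk on words over $\{a,b\}$ which appends a uniformly chosen letter with probability `p_append` and deletes the last letter otherwise; all function and parameter names are our own choices, and the walk is only meant to satisfy rules (i)-(iv) in the simplest possible way.

```python
import math
from collections import defaultdict


def one_step(word, p_append):
    """One-step transition probabilities of a toy walk on words over {a, b}.

    Hypothetical example: append a uniform letter with probability p_append,
    delete the last letter otherwise (at the empty word only appending is possible).
    """
    if word == "":
        return {"a": 0.5, "b": 0.5}
    return {
        word + "a": p_append / 2,
        word + "b": p_append / 2,
        word[:-1]: 1.0 - p_append,
    }


def distributions(n_steps, p_append=0.75):
    """Return [pi_0, ..., pi_n], the exact distributions of X_0, ..., X_n."""
    pi = {"": 1.0}
    result = [pi]
    for _ in range(n_steps):
        nxt = defaultdict(float)
        for word, mass in pi.items():
            for target, prob in one_step(word, p_append).items():
                nxt[target] += mass * prob
        pi = dict(nxt)
        result.append(pi)
    return result


def normalized_entropy(pi, n):
    """Compute -(1/n) * E[log pi_n(X_n)] for the distribution pi of X_n."""
    return -sum(mass * math.log(mass) for mass in pi.values() if mass > 0) / n


if __name__ == "__main__":
    dists = distributions(12)
    for n in (1, 4, 8, 12):
        print(n, normalized_entropy(dists[n], n))
```

The number of words carrying mass grows exponentially in $n$, so this brute-force computation only serves to make the definition concrete; it says nothing about whether the sequence converges.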

Moreover, we assume that the random walk on $\mathcal{L}$ is suffix-irreducible, that is, for all $w\in\mathcal{L}$
with $\mathbb{P}[X_m = w] > 0$ for some $m\in\mathbb{N}$ and for all $ab\in A^2$ there is some $n\in\mathbb{N}$ such that

$$\mathbb{P}\bigl[\exists w_1\in A^*: X_n = ww_1ab,\ \forall k\leq n: |X_k|\geq|w| \bigm| X_0 = w\bigr] > 0.$$

This assumption excludes degenerate cases and guarantees existence of $\ell$; compare with [6,
End of Section 2.1]. At this point let us mention that $\lim_{n\to\infty} -\frac{1}{n}\log\pi_n(X_n)$ is not
necessarily deterministic: take two homogeneous trees of different degrees $d_1,d_2\geq 3$ equipped
with simple random walk; identify their roots with one single root which becomes $o$; then
the limit depends on which of the two trees the random walk follows to infinity.

3. Generating Functions 
For $w_1,w_2\in\mathcal{L}$, $z\in\mathbb{C}$, the Green function is defined as

$$G(w_1,w_2|z) := \sum_{n\geq 0} p^{(n)}(w_1,w_2)\, z^n,$$

the last visit generating function as

$$L(w_1,w_2|z) := \sum_{n\geq 0}\mathbb{P}\bigl[X_n = w_2,\ \forall m\in\{1,\dots,n\}: X_m\neq w_1 \bigm| X_0 = w_1\bigr]\, z^n$$

and the first return generating function as

$$U(w_1,w_1|z) := \sum_{n\geq 1}\mathbb{P}\bigl[X_n = w_1,\ \forall m\in\{1,\dots,n-1\}: X_m\neq w_1 \bigm| X_0 = w_1\bigr]\, z^n.$$

By conditioning on the last visit to $w_1$, an important relation between these functions is
given by

$$G(w_1,w_2|z) = G(w_1,w_1|z)\cdot L(w_1,w_2|z).$$

Denote by $R_w$ the radius of convergence of $G(w,w|z)$, $w\in\mathcal{L}$. If $R_w > 1$ then

$$G(w,w|1) \leq \frac{1}{1-\frac{1}{R_w}}; \qquad (3.1)$$

indeed, since $G(w,w|z) = \bigl(1-U(w,w|z)\bigr)^{-1}$, it must be that $U(w,w|z) < 1$ for all
$0 < z < R_w$; moreover, $U(w,w|0) = 0$ and $U(w,w|z)$ is continuous, strictly increasing and
strictly convex for $0 < z < R_w$, so we must have $U(w,w|1)\leq 1/R_w$, which yields (3.1).
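
The identity $G(w,w|z) = (1-U(w,w|z))^{-1}$ used above is equivalent to a coefficientwise renewal equation: writing $u^{(k)}$ for the coefficients of $U(w,w|z)$ (our notation), one has $p^{(n)}(w,w) = \sum_{k=1}^{n} u^{(k)} p^{(n-k)}(w,w)$ for $n\geq 1$. The following Python sketch checks this numerically for a hypothetical nearest-neighbour chain on $\{0,1,2,\dots\}$ reflected at $0$, which can be read as the word length of a walk over a one-letter alphabet; the chain and all names are assumptions made for illustration only.

```python
def advance(dist, p_up):
    """Push a (sub-)distribution on {0, 1, 2, ...} one step forward."""
    nxt = {}
    for k, mass in dist.items():
        if k == 0:
            nxt[1] = nxt.get(1, 0.0) + mass  # reflection at 0
        else:
            nxt[k + 1] = nxt.get(k + 1, 0.0) + mass * p_up
            nxt[k - 1] = nxt.get(k - 1, 0.0) + mass * (1.0 - p_up)
    return nxt


def return_coefficients(p_up, n_max):
    """Coefficients p^(n)(0,0) of G(0,0|z) and u^(n) of U(0,0|z) for n <= n_max."""
    dist = {0: 1.0}   # law of X_n
    taboo = {0: 1.0}  # mass of paths that have not yet returned to 0
    p_coeff, u_coeff = [1.0], [0.0]
    for _ in range(n_max):
        dist = advance(dist, p_up)
        taboo = advance(taboo, p_up)
        p_coeff.append(dist.get(0, 0.0))
        # mass arriving at 0 is a first return; remove it from the taboo process
        u_coeff.append(taboo.pop(0, 0.0))
    return p_coeff, u_coeff


def renewal_defect(p_coeff, u_coeff):
    """Largest violation of p^(n) = sum_{k=1..n} u^(k) p^(n-k) over n >= 1."""
    return max(
        abs(p_coeff[n] - sum(u_coeff[k] * p_coeff[n - k] for k in range(1, n + 1)))
        for n in range(1, len(p_coeff))
    )


if __name__ == "__main__":
    p_coeff, u_coeff = return_coefficients(p_up=0.75, n_max=30)
    print("renewal defect:", renewal_defect(p_coeff, u_coeff))
    print("partial sum of U(0,0|1):", sum(u_coeff))
```

Since the renewal equation holds coefficient by coefficient, the check is exact up to floating-point rounding even though both power series are truncated.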

In the following we introduce further generating functions, which also have been used in
[6]. Define for $a,b,c,d,e\in A$ and real $z > 0$

$$H(ab,c|z) := \sum_{n\geq 1}\mathbb{P}\bigl[X_n = c,\ \forall m<n: |X_m| > 1 \bigm| X_0 = ab\bigr]\, z^n$$

and

$$\overline{L}(ab,cde|z) := \sum_{n\geq 1}\mathbb{P}\bigl[X_n = cde,\ \forall m\in\{1,\dots,n\}: |X_m|\geq 3 \bigm| X_0 = ab\bigr]\, z^n,$$

$$\overline{G}(ab,cd|z) := \sum_{n\geq 0}\mathbb{P}\bigl[X_n = cd,\ \forall m\in\{1,\dots,n\}: |X_m|\geq 2 \bigm| X_0 = ab\bigr]\, z^n.$$

We write $\overline{L}(ab,cde) := \overline{L}(ab,cde|1)$. These generating functions can be computed in two
steps: first, one solves the following system of equations:

$$H(ab,c|z) = p(ab,c)\, z + \sum_{de\in A^2} p(ab,de)\, z\, H(de,c|z) + \sum_{def\in A^3} p(ab,def)\, z\sum_{g\in A} H(ef,g|z)\, H(dg,c|z); \qquad (3.2)$$

compare with [12] and [6]. The system (3.2) consists of equations of quadratic order, and
therefore the functions $H(\cdot,\cdot|z)$ are algebraic if the transition probabilities are algebraic.
We now get the functions $\overline{G}(ab,cd|z)$ by solving the following linear system of equations:

$$\overline{G}(ab,cd|z) = \delta_{ab}(cd) + \sum_{c_1d_1\in A^2} p(ab,c_1d_1)\, z\, \overline{G}(c_1d_1,cd|z) + \sum_{c_1d_1e_1\in A^3} p(ab,c_1d_1e_1)\, z\sum_{f\in A} H(d_1e_1,f|z)\, \overline{G}(c_1f,cd|z).$$

Finally, we get

$$\overline{L}(ab,cde|z) = \sum_{d_1e_1\in A^2} p(ab,cd_1e_1)\, z\, \overline{G}(d_1e_1,de|z).$$

We remark that we implicitly took into account the assumption $\mathcal{L} = A^*$; if $\mathcal{L}\subsetneq A^*$ one
has to restrict these definitions and systems of equations to the terms which may occur.
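
To make the first step concrete, the quadratic system (3.2) can be approximated numerically. The Python sketch below uses plain fixed-point iteration started at zero, which for polynomial systems with non-negative coefficients increases monotonically towards the minimal non-negative solution; this iteration scheme is our own choice and not taken from the paper. The dictionary layout of `p` and the one-letter test walk are hypothetical.

```python
from itertools import product


def solve_H(alphabet, p, z=1.0, iterations=2000):
    """Approximate H(ab, c | z) from system (3.2) by fixed-point iteration from 0.

    p maps (ab, target) to the transition probability p(ab, target), where
    target is a word of length 1, 2 or 3 (missing keys mean probability 0).
    """
    pairs = ["".join(t) for t in product(alphabet, repeat=2)]
    H = {(ab, c): 0.0 for ab in pairs for c in alphabet}
    for _ in range(iterations):
        new_H = {}
        for ab in pairs:
            for c in alphabet:
                total = 0.0
                for (source, target), prob in p.items():
                    if source != ab:
                        continue
                    if len(target) == 1:
                        # direct step ab -> c
                        if target == c:
                            total += prob * z
                    elif len(target) == 2:
                        # step ab -> de, then descend from de to c
                        total += prob * z * H[(target, c)]
                    elif len(target) == 3:
                        # step ab -> def, descend from ef to g, then from dg to c
                        d, e, f = target
                        total += prob * z * sum(
                            H[(e + f, g)] * H[(d + g, c)] for g in alphabet
                        )
                new_H[(ab, c)] = total
        H = new_H
    return H


if __name__ == "__main__":
    # Hypothetical one-letter walk: delete w.p. 0.25, stay w.p. 0.25, append w.p. 0.5.
    p = {("aa", "a"): 0.25, ("aa", "aa"): 0.25, ("aa", "aaa"): 0.5}
    print(solve_H("a", p)[("aa", "a")])
```

In the one-letter example the system reduces to the scalar equation $H = 0.25 + 0.25H + 0.5H^2$ at $z = 1$, whose roots are $0.5$ and $1$; the iteration picks the smaller one, which is the probability of ever dropping one level.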
Moreover, one can compute the Green functions of the form $G(o,abc|z)$ by solving

$$G(w_1,w_2|z) = \delta_{w_1}(w_2) + \sum_{w_3\in\mathcal{L}:\,|w_3|\leq 3} p(w_1,w_3)\, z\, G(w_3,w_2|z) + \mathbf{1}_3(w_1)\sum_{cde\in A^3} p(w_1[2]w_1[3],cde)\, z\sum_{f\in A} H(de,f|z)\, G(w_1[1]cf,w_2|z),$$

where $w_1,w_2\in\mathcal{L}$ with $|w_1|,|w_2|\leq 3$ and $\mathbf{1}_3(w_1) = 1$ if $|w_1| = 3$, and $\mathbf{1}_3(w_1) = 0$
otherwise.

We also define for $ab\in A^2$:

$$\xi(ab) := \sum_{cde\in A^3} p(ab,cde)\Bigl(1-\sum_{f\in A} H(de,f|1)\Bigr).$$

This is the probability of starting at a word $wab\in\mathcal{L}$, where $w\in A^*$, such that the first
step goes to a word of length $|wab|+1$ with no further future visits of words of length
$|wab|$ or smaller. We define a "length function" on $\mathcal{L}$ by

$$l(x_1\dots x_n) := -\log L(o,x_1\dots x_n) \quad\text{for } x_1x_2\dots x_n\in\mathcal{L}. \qquad (3.3)$$

For $n\geq 5$, the terms $L(o,x_1\dots x_n)$ can be rewritten as

$$\sum_{b_1c_1\in A^2}\ \sum_{\substack{b_2,\dots,b_{n-3}\in A,\\ c_2,\dots,c_{n-3}\in A}} L(o,x_1b_1c_1|1)\prod_{j=2}^{n-3}\overline{L}(b_{j-1}c_{j-1},x_jb_jc_j)\cdot\overline{L}(b_{n-3}c_{n-3},x_{n-2}x_{n-1}x_n); \qquad (3.4)$$

each path from $o$ to $x_1\dots x_n$ is decomposed according to the last times when the sets
$A^3, A^4,\dots,A^{n-1}$ are visited.

4. Cones

In this section we introduce the structure of cones in our setting. A path in $\mathcal{L}$ is a sequence
of words $[w_0,w_1,\dots,w_m]$ in $\mathcal{L}$ such that $\mathbb{P}_{w_{i-1}}[X_1 = w_i] > 0$ for all $1\leq i\leq m$. For $n\in\mathbb{N}$,
define $\mathcal{L}_{\geq n} := \{w\in\mathcal{L}\mid |w|\geq n\}$. For any $w\in\mathcal{L}$ with $|w|\geq 2$, we define the cone rooted
at $w$ as

$$C(w) := \bigl\{w'\in\mathcal{L}_{\geq|w|}\bigm|\exists\text{ path } [w,w_1,\dots,w_{m-1},w'] \text{ with } m\in\mathbb{N},\ w_1,\dots,w_{m-1}\in\mathcal{L}_{\geq|w|}\bigr\}.$$

By the weak symmetry assumption made above, for $w_1,w_2\in\mathcal{L}$ with $|w_1| = |w_2|$, we have
$C(w_1) = C(w_2)$ whenever there is a positive-probability path from $w_1$ to $w_2$ in $\mathcal{L}_{\geq|w_1|}$.

By suffix-irreducibility, for all $cd\in A^2$, each cone $C(wab)$, where $w\in A^*$ and $ab\in A^2$,
has a subcone $C(wxcd)\subset C(wab)$ with a suitable choice of $x\in A^*\setminus\{o\}$. We say that two
cones $C(w_1\dots w_m)$ and $C(y_1\dots y_n)$ are isomorphic if $C(w_{m-1}w_m) = C(y_{n-1}y_n)$, that is,
two isomorphic cones differ only by different prefixes. In particular, there is a natural
one-to-one correspondence of paths inside $C(w_1\dots w_m)$ and paths in $C(y_1\dots y_n)$, where
each pair of corresponding paths has the same probability. Since the transition probabilities
depend only on the last two letters of the current word, there are only finitely many different
cone types up to isomorphism. We identify the different cone types by two-lettered words
$ab\in A^2$, and write $\tau(C(w)) = ab$ for its cone type, where $ab$ are the last two letters of $w$.
For each isomorphism class of cone types we fix some $ab$ representing its cone type. Let
$\mathcal{J}\subseteq A^2$ be the set of different cone types. The boundary $\partial C(w)$ of $C(w)$ is given by all
words $w_0\in C(w)$ with $|w_0| = |w|$. An important property is the following one: if $C(w_1)$
and $C(w_2)$ are two isomorphic cones with $w_0ab\in\partial C(w_1)$, then there is $w_0'\in A^*$ such that
$w_0'ab\in\partial C(w_2)$.
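
For small alphabets, cones and their boundaries can be explored by a breadth-first search over words. The Python sketch below assumes, purely for illustration, that $\mathcal{L} = A^*$ with $A = \{a,b\}$ and that every move permitted by rules (i)-(iv) has positive probability; the search is truncated at a maximal word length, so in a walk with restricted support it could miss words that are only reachable via longer ones. All names are hypothetical.

```python
from collections import deque
from itertools import product

ALPHABET = "ab"  # hypothetical two-letter alphabet


def successors(word):
    """All words reachable in one step from a word of length >= 2.

    Assumption: every move allowed by rules (i)-(iv) has positive probability,
    i.e. the last two letters may be replaced by any word of length 1, 2 or 3.
    """
    stem = word[:-2]
    for length in (1, 2, 3):
        for tail in product(ALPHABET, repeat=length):
            yield stem + "".join(tail)


def cone(root, max_len):
    """Words of C(root) of length at most max_len, found by breadth-first search."""
    min_len = len(root)
    seen = {root}
    queue = deque([root])
    while queue:
        word = queue.popleft()
        for nxt in successors(word):
            # paths must stay in words of length >= |root|; max_len truncates the search
            if min_len <= len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def boundary(root, max_len):
    """Boundary of C(root): the cone elements of the same length as root."""
    return {w for w in cone(root, max_len) if len(w) == len(root)}


if __name__ == "__main__":
    print(sorted(boundary("aab", 4)))
    print(cone("aab", 4).isdisjoint(cone("bab", 4)))
```

In this full-support example the cone rooted at `aab` consists of all words of length at least three starting with `a`, so it is disjoint from the cone rooted at `bab`, and all cones of the form $C(wab)$ differ only by their prefixes.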

Now we make the non-singular covering assumption that each cone $C(wa_0b_0)$, $w\in A^*$,
$a_0b_0\in A^2$, contains two proper disjoint subcones, that is, we assume that there are subcones
of the form $C(ww_1a_1b_1), C(ww_2a_2b_2)\subset C(wa_0b_0)$ with $w_i\in A^*\setminus\{o\}$, $a_ib_i\in A^2$ and
$C(ww_1a_1b_1)\cap C(ww_2a_2b_2) = \emptyset$. We refer to the remarks at the end of this section if this
property does not hold. The next task is to cover (up to a finite complement) any cone
$C(w)$ by a finite number of pairwise disjoint subcones $C_1,\dots,C_{r(w)}$ such that

$$\bigcup_{i=1}^{r(w)}\{\tau(C_i)\} = \mathcal{J} \quad\text{and}\quad \Bigl|C(w)\setminus\bigcup_{i=1}^{r(w)} C_i\Bigr| < \infty,$$

that is, among these subcones every cone type appears. We now show how to construct this
covering. Suppose we are given a cone $C(wa_0b_0)$ with $w\in A^*$ and $a_0b_0\in A^2$. Inside this
cone we find subcones of the form $C(ww_0ab)$ for each $ab\in A^2$ with suitable $w_0\in A^*\setminus\{o\}$.
Furthermore, we can choose these subcones in such a way that they are not contained in
each other, that is, $C(ww_1a_1b_1)\not\subseteq C(ww_2a_2b_2)$ for all these chosen cones of all different
types: indeed, since we assume existence of a non-singular covering of $C(w)$ by subcones,
one can walk from $w$ inside $\mathcal{L}_{\geq|w|}$ to words $ww_1a_1b_1$ and $ww_2a_2b_2$, where $w_1,w_2\in A^*$,
$a_ib_i\in A^2$ and $C(ww_1a_1b_1)\cap C(ww_2a_2b_2) = \emptyset$. Then we have found a subcone of type
$\tau(C(a_1b_1))$, and we search for other cone types in the subcone $C(ww_2a_2b_2)$. Obviously, a
subcone in $C(ww_2a_2b_2)$ does not intersect $C(ww_1a_1b_1)$. Iterating this step leads to subcones
in $C(w)$ of all different types which do not intersect each other. After we have found
non-intersecting subcones of all types in $C(w)$ we cover this cone by further subcones, which
do not intersect the subcones chosen above, such that the difference of $C(w)$ and the
union of subcones is finite. This is, for instance, done by taking all cones rooted at words
$v\in C(w)$, where $v$ is at the same distance (that is, minimal length of a path) to $\partial C(w)$ as
the subcone of maximal distance to $w$ and where $v$ is not yet contained in any of the
subcones chosen above. See Figure 1.




Figure 1. Covering of cones by subcones: the numbers represent the dif- 
ferent cone types; the cones with entire boundary lines belong to the cov- 
ering. 

Let us remark that, for each cone type, we fix such a covering, such that the covering of
$C(w)$ does not depend on the choice of the specific root $w$ on the boundary of $C(w)$: fix a
covering for $C(ab)$, $ab\in A^2$; if $w = w_0a_1b_1\in\mathcal{L}$ with $\tau(C(w)) = ab$ then we fix the covering
of $C(w) = C(w_0ab)$ which inherits the covering from $C(ab)$. This is well-defined since
the covering of a cone depends only on the relative location of its subcones in its interior.

We can also cover $\mathcal{L}$ (up to a finite set) by a finite number of non-self-containing subcones,
where each cone type appears. To this end, we just apply the algorithm explained above
and take cones of the form $C(w)$ with $|w|\geq 2$. We denote by $C_1^{(0)},\dots,C_{n_0}^{(0)}$ the covering
of $\mathcal{L}$, which contains all types and whose complement is finite.

Now we explain how to proceed if no cone contains two disjoint subcones. This case
may, in particular, occur if $\mathcal{L}$ is a proper subset of $A^*$. For $ab,cd\in A^2$, observe that
$cd\in C(ab)$ if and only if $ab\in C(cd)$. This implies that $C(w) = \{v\in\mathcal{L}\mid |v|\geq|w|\}$ and, in
particular, that there is only one cone type. We can then cover $C(w)$ by the subcone
$C(w_1)$ for any $w_1\in\mathcal{L}$ with $|w_1| = |w|+1$ and $p(w,w_1) > 0$. One can show that in this
case the random walk converges almost surely to a deterministic infinite word and that the
support of the random walk is a proper subset of $A^*$. In order to see this, assume that the
random walk tends with positive probability to infinite words with prefixes
$wabc$ and $wdef$, where $w\in A^*$, $a,b,c,d,e,f\in A$ with $a\neq d$. Since $C(wabc)\cap C(wdef) = \emptyset$,
it must be that the random walk enters either $C(wabc)$ or $C(wdef)$ on its way to infinity
due to the assumption of singular covering. That is, the letter $a$ is deterministic, and by
induction the infinite limiting word is deterministic.

We call the random walk expanding if each cone contains two disjoint subcones. The results
below do not depend on whether the random walk is expanding or not. At the end, however,
we will see that the non-expanding case leads to zero entropy.

5. Exit Times 

In this section we prove a law of large numbers, whose limit turns out to be the asymptotic
entropy in a later section. For this purpose, we define exit times (compare with [6]), for
which we derive a law of large numbers. Throughout this section, we use the following
notation: $w_0,w_1,w_2\in A^*\setminus\{o\}$ and $a,b,c,d,a_1,b_1,a_2,b_2,\dots\in A$.

5.1. Exit Time Process. We define the following exit times. Let $e_0$ be the last time
at which the random walk visits $\bigcup_{i=1}^{n_0}\partial C_i^{(0)}$ and stays in one of the cones $C_1^{(0)},\dots,C_{n_0}^{(0)}$
afterwards forever, that is,

$$e_0 := \sup\bigl\{m\in\mathbb{N}_0\bigm| X_m\in\partial C_i^{(0)},\ \forall n>m: X_n\in C_i^{(0)}\setminus\partial C_i^{(0)} \text{ for some } i\in\{1,\dots,n_0\}\bigr\}.$$

Inductively, if $X_{e_k} = w$ and $C(w)$ has a covering (determined only by the type of $C(w)$)
consisting of subcones $C_1^{(k)},\dots,C_{r(w)}^{(k)}$ as explained in Section 4, then

$$e_{k+1} := \sup\bigl\{m>e_k\bigm| X_m\in\partial C_i^{(k)},\ \forall n>m: X_n\in C_i^{(k)}\setminus\partial C_i^{(k)} \text{ for some } i\in\{1,\dots,r(w)\}\bigr\}.$$

Observe that $X_n$, $n\geq e_k$, has the prefix $w_0$ if $X_{e_k} = w_0ab$. Define the relative increment
between two exit times as follows: set $\mathbf{W}_0 := X_{e_0}$; for $k\geq 1$: if $X_{e_{k-1}} = w_0ab$ and
$X_{e_k} = w_0w_1cd$, then $\mathbf{W}_k := w_1cd$. Since we have only finitely many different
cone types and the subcones of coverings of any cone $C$ are nested at uniformly bounded
distance (w.r.t. minimal path lengths) to $\partial C$, the random variables $\mathbf{W}_k$ can take only
finitely many different values.

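For $x\in\mathcal{L}$, define

$$S(x) := \bigcup_{i=1}^{r(x)}\partial C_i,$$

where $C_1,\dots,C_{r(x)}$ is a covering of $C(x)$. Furthermore, define for $x = x_1\dots x_n\in\mathcal{L}$ and
$y = x_1\dots x_{n-2}x'_{n-1}x'_n\dots x'_{n+d}\in S(x)$ with $d = d(x,y) := |y|-|x|$

$$\mathbb{L}(x,y) := \sum_{k\geq 0}\mathbb{P}\bigl[X_k = y,\ \forall m\in\{1,\dots,k\}: X_m\in C(x)\setminus\partial C(x)\bigm| X_0 = x\bigr]$$

$$= \sum_{y_1,\dots,y_{d-1}}\overline{L}(x_{n-1}x_n,y_1)\cdot\overline{L}(y_1[2]y_1[3],y_2)\cdot\ldots\cdot\overline{L}(y_{d-1}[2]y_{d-1}[3],x'_{n+d-2}x'_{n+d-1}x'_{n+d}),$$

where the sum runs over the intermediate words $y_1,\dots,y_{d-1}\in A^3$ compatible with $y$, in
analogy to (3.4). Obviously, $\mathbb{L}(x,y)$ depends on $x$ only through its last two letters.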

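Proposition 5.1. The process $(\mathbf{W}_k)_{k\in\mathbb{N}}$ is a positive recurrent Markov chain with transition
probabilities

$$q(x,y) := \begin{cases}\dfrac{\xi([y])}{\xi([x])}\,\mathbb{L}(x,y), & \text{if } y\in S(x),\\[1ex] 0, & \text{otherwise.}\end{cases}$$

Proof. Let $w_0,\dots,w_{k+1}\in A^*\setminus\{o\}$ be such that $w_{i+1}\in S(w_i)$ for all $i\in\{0,\dots,k\}$ and
$w_0\in\bigcup_{i=1}^{n_0}\partial C_i^{(0)}$. We set $x_0 := w_0$ and inductively: if $x_{k-1} = y_{k-1}a_{k-1}b_{k-1}$ with $y_{k-1}\in A^*$
and $a_{k-1}b_{k-1}\in A^2$ then set $x_k := y_{k-1}w_k$. Then:

$$\mathbb{P}[\mathbf{W}_0 = w_0,\dots,\mathbf{W}_k = w_k] = \mathbb{P}[X_{e_0} = x_0,\dots,X_{e_k} = x_k]$$

$$= G(o,w_0|1)\cdot\mathbb{L}(w_0,w_1)\,\mathbb{L}(w_1,w_2)\cdot\ldots\cdot\mathbb{L}(w_{k-1},w_k)\cdot\xi([w_k]).$$

Consider

$$\mathbb{P}[\mathbf{W}_{k+1} = w_{k+1}\mid\mathbf{W}_0 = w_0,\dots,\mathbf{W}_k = w_k] = \frac{\mathbb{P}[\mathbf{W}_0 = w_0,\dots,\mathbf{W}_k = w_k,\mathbf{W}_{k+1} = w_{k+1}]}{\mathbb{P}[\mathbf{W}_0 = w_0,\dots,\mathbf{W}_k = w_k]}$$

$$= \frac{G(o,w_0|1)\cdot\mathbb{L}(w_0,w_1)\,\mathbb{L}(w_1,w_2)\cdot\ldots\cdot\mathbb{L}(w_{k-1},w_k)\cdot\mathbb{L}(w_k,w_{k+1})\cdot\xi([w_{k+1}])}{G(o,w_0|1)\cdot\mathbb{L}(w_0,w_1)\,\mathbb{L}(w_1,w_2)\cdot\ldots\cdot\mathbb{L}(w_{k-1},w_k)\cdot\xi([w_k])} = q(w_k,w_{k+1}).$$

Since there are only finitely many different values for $\mathbf{W}_k$, positive recurrence follows from
suffix-irreducibility, which implies irreducibility of the process $(\mathbf{W}_k)_{k\in\mathbb{N}}$. □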

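The random variables $\mathbf{W}_k$, $k\geq 1$, can take values in

$$\mathcal{W}_0 := \bigl\{w\in\mathcal{L}\bigm|\mathbb{P}[\mathbf{W}_2 = w\mid\mathbf{W}_1 = w_0ab] > 0 \text{ for some } w_0\in A^*,\ ab\in A^2\bigr\}.$$

Observe that the transition probabilities depend on $x$ only through its last two letters.

Lemma 5.2. We have $\mathrm{supp}(\mathbb{P}[\mathbf{W}_1 = \cdot\,]) = \mathcal{W}_0$.

Proof. Let $y = a_1b_1w_ya_2b_2\in\mathcal{W}_0$ with $w_y\in A^*$ (we omit the special case $y = a_1a_2b_2$, which
follows analogously). Then there is $a_1b_1\in A^2$ with $\mathbb{L}(a_1b_1,y) > 0$ and $\xi(a_2b_2) > 0$. By
construction of our coverings there is some $w_0\in A^*$ with $w_0a_1b_1\in\bigcup_{i=1}^{n_0}\partial C_i^{(0)}$. Choose
$m\in\mathbb{N}$ such that $p^{(m)}(o,w_0a_1b_1) > 0$. Then:

$$\mathbb{P}[\mathbf{W}_1 = y]\geq p^{(m)}(o,w_0a_1b_1)\cdot\mathbb{L}(a_1b_1,y)\cdot\xi(a_2b_2) > 0.$$

□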

For the sake of better identification of the cones, we now switch to a more suitable
representation of cones and coverings. We identify the different cone types by numbers
$\mathcal{I} := \{1,\dots,r\}\subset\mathbb{N}$. If $C(w)$ is a cone of type $i$, then the covering of $C(w)$ has $n_j$ subcones
of type $j$. We denote these subcones by $C_{j_{i,1}} = C_{j_{i,1}}(w),\dots,C_{j_{i,n_j}} = C_{j_{i,n_j}}(w)$ or identify
them just by $j_{i,1},\dots,j_{i,n_j}$, which correspond to the cones of type $j$ with different locations
inside $C(w)$. We will sometimes omit the root $w$ in the notation of the subcones when it
is clear from the context and only the relative position of a subcone in some given
cone is important. If $\tau(C(X_{e_{k-1}})) = i$ and $X_{e_k}\in\partial C_{j_{i,l}}(X_{e_{k-1}})$, then we set $\mathbf{i}_k := j_{i,l}$.

At this point we recall the relation between $\mathbf{W}_k$ and $X_{e_k}$: if $X_{e_0} = w_0a_0b_0$ and $\mathbf{W}_1 =
w_1a_1b_1$ then $X_{e_1} = w_0w_1a_1b_1$; in general, if $X_{e_{k-1}} = wa_{k-1}b_{k-1}$ and $\mathbf{W}_k = w_ka_kb_k$ then
$X_{e_k} = ww_ka_kb_k$. That is, there is a natural bijection of trajectories of $(\mathbf{W}_k)_{k\in\mathbb{N}}$ and
$(X_{e_k})_{k\in\mathbb{N}}$. In particular, the value of $\mathbf{W}_k$ determines the value of $\mathbf{i}_k$ uniquely. For a better
visualization of the values $\mathbf{i}_k := j_{i,l}$, see Figure 2. In other words, the random variables
$\mathbf{i}_k$ collect the different cones which are entered successively by the random walk $(X_n)_{n\in\mathbb{N}_0}$
on its way to infinity, while the $\mathbf{W}_k$'s keep, in addition, the information where the single
subcones are finally entered.

Figure 2. Numbering of subcones

For $s,t\in\mathcal{I}$, let $n(s,t)$ denote the number of different cones of type $t$ in the covering of a
cone of type $s$. Define

$$\mathcal{W} := \Bigl\{(j_{m,n},x)\Bigm| j,m\in\mathcal{I},\ \tau(C(x[1]x[2])) = m,\ \tau(C(x)) = j,\ 1\leq n\leq n(m,j),\ x\in\partial C_{j_{m,n}}(x[1]x[2])\cap\mathcal{W}_0\Bigr\}.$$

In other words, $(j_{m,n},x)\in\mathcal{W}$ if $x = a_0b_0w_0ab\in\mathcal{W}_0$ with $\tau(C(a_0b_0)) = m$ and $C(x)$ is the
$n$-th cone of type $j$ inside $C(a_0b_0)$. Furthermore, define

$$\mathcal{W}_\pi := \{(s,t_n)\mid s,t\in\mathcal{I},\ 1\leq n\leq n(s,t)\}.$$

That is, $t_n$ corresponds to the $n$-th cone of type $t$ in a covering of a cone of type $s$.

The process $((\mathbf{i}_k,\mathbf{W}_k))_{k\in\mathbb{N}}$ with state space $\mathcal{W}$ is also a positive recurrent Markov chain
since the values of $\mathbf{i}_k$ are uniquely determined by the values of $\mathbf{W}_k$ and the process $(\mathbf{W}_k)_{k\in\mathbb{N}}$
is a Markov chain. Moreover, for $(i_{k,l},w_{k-1}),(j_{m,n},w_k)\in\mathcal{W}$, the transition probabilities
are given by

$$\mathbb{P}\bigl[(\mathbf{i}_k,\mathbf{W}_k) = (j_{m,n},w_k)\bigm|(\mathbf{i}_{k-1},\mathbf{W}_{k-1}) = (i_{k,l},w_{k-1})\bigr] = \begin{cases} q(w_{k-1},w_k), & \text{if } m = i,\\ 0, & \text{if } m\neq i.\end{cases}$$

In particular, the transition matrix of $((\mathbf{i}_k,\mathbf{W}_k))_{k\in\mathbb{N}}$ has zero entries. In order to apply the
result of [10, Theorem 1.1] for obtaining the analytic behaviour of the entropy later, we have
to adapt the Markov chain in order to obtain a transition matrix without zeroes.

The process $(\mathbf{i}_k)_{k\in\mathbb{N}}$ is, in general, not a Markov chain because it can be seen as a projection
of the process $(\mathbf{W}_k)_{k\in\mathbb{N}}$.
Define the following projection for $(i_{k,l},w_1),(j_{m,n},w_2)\in\mathcal{W}$:

$$\pi\bigl((i_{k,l},w_1),(j_{m,n},w_2)\bigr) := \begin{cases}(i,j_{i,n}) =: (i,j_n), & \text{if } m = i,\\ (i,j_{i,1}) =: (i,j_1), & \text{if } m\neq i.\end{cases}$$

Here, $j_l$ represents the $l$-th cone of type $j$ in a cone of type $i$, namely the cone represented
by $j_{i,l}$. We now define the hidden Markov chain $(\mathbf{Y}_k)_{k\in\mathbb{N}}$ by

$$\mathbf{Y}_k := \pi\bigl((\mathbf{i}_k,\mathbf{W}_k),(\mathbf{i}_{k+1},\mathbf{W}_{k+1})\bigr).$$

In other words, $(\mathbf{Y}_k)_{k\in\mathbb{N}}$ traces the way to infinity in terms of which subcones are entered
successively, without distinguishing which of the hit boundary points are the exit time
points $X_{e_k}$.

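Moreover, observe that $\mathrm{supp}(\mathbb{P}[\mathbf{Y}_1 = \cdot\,]) = \mathcal{W}_\pi$ since, by construction of the coverings,
for any $(s,t_n)\in\mathcal{W}_\pi$ there is $w_0ab\in\bigcup_{i=1}^{n_0}\partial C_i^{(0)}$ and $x\in\mathcal{W}_0$ with $\tau(C(x)) = s$ and
$\mathbb{P}[X_{e_1} = w_0x\mid X_{e_0} = w_0ab] > 0$, and there is $y\in\partial C_{t_{s,n}}([x])$ with $q(x,y) > 0$ such that

$$\mathbb{P}[\mathbf{Y}_1 = (s,t_n)]\geq\mathbb{P}[X_{e_0} = w_0ab,\ X_{e_1} = w_0x,\ \mathbf{W}_2 = y]$$

$$\geq\mathbb{P}[X_{e_0} = w_0ab]\cdot\mathbb{P}[X_{e_1} = w_0x\mid X_{e_0} = w_0ab]\cdot q(x,y) > 0.$$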

5.2. Modified Exit Time Process. The aim of this subsection is the construction of a
Markov chain related to the exit time process $(\mathbf{i}_k,\mathbf{W}_k)_{k\in\mathbb{N}}$ such that the transition matrix
has strictly positive entries and the modified process leads under $\pi$ to the same hidden
Markov chain for almost every trajectory.

Consider the two subcones Cj . ^ C C{aibi) and Cj^. ^ C (^(0262) belonging to coverings of 
the bigger cones with r(C(ai6i)) = i and T{C{a2b2)) = k. Assume that yoab G dCj^^. 

Since both cones are isomorphic, there is unique yo = ^q '"'''^ S A* such that y^ab S dCj^ ^ ; 

see Figure [3l In the following we will use this notation yo = y^'^ for describing this 




Figure 3. Prefix replacement 

replacement. 

For i,j G T and ab G J? with T{C{ab)) = j we write 

#{JM \s^i}= \{{js,uxab) G W\s G X\ {i}, 1 < t < n(s,j)}| 
which is independent from the specific choice of ab. Let {ik,i,x), (jm,n)?/o^^) S W. Define
the following transition probabilities on W: 






aab) 



f^I.{x,yoab), 



l{{'i'k,hx),{jm,n,yoab)) := < |^]L(x,2/oa&), 



if 7TT- = z A n = 1, 
if 771 = i A n > 2, 



Observe that the transitions depend on x only by its last two letters. It is easy to see that 
these transition probabilities define a Markov chain (inherited from the process (i^, W^j^gp^) 
each step from {ik,ux) to {jm,n^yooh) either behaves according to (/(•,•) (case m = i and 
n > 2) or steps from {ik,i,x) to {ji^i,yoab) (when seen as a step of the process (i^, Wfc)fcgi^)) 
are split up into different equally likely paths {ik,ux) to {jm,n-,yoO-^) with m ^ i or 
m = i An = 1; since q{-, •) depends on its first argument only by i (and not by k and /), it 
follows from q{-, •) that q{-, •) describes also a random walk. Moreover, the corresponding 
transition matrix has strictly positive entries. By suffix-irreducibility and Proposition 15. H 
the matrix Q = [q{{ik.ux), {jm,n,y))) is stochastic, and governs a positiv recurrent Markov 
chain {ikj^k)keN with invariant probability measure i'. The initial distribution of (ii,xi) 
is given hy fti, defined as 

f^i{in,,n,x) := P[(ii, Wi) = {im,n,x)] = F[t{C{X^,)) = ?n, Wi =x]>0 

for {im,n,x) G yy. If we equip the process with the invariant probability measure v as 

'fcGN' 



initial distribution we write (^v^ ,Ji.j^ \ 



Then the process ((ifc,Xfc), (ifc_|_i,Xfc+i)), j^ is again a positiv recurrent Markov chain with 

transition matrix Q2 (arising from Q) and invariant probability measure denoted by 1/2- 
Once again, if we equip this process with the initial distribution 1/2, which arises in a 
natural way from v, then we write ((i^'^ , x^*^ ), (i^'V|,x^'^-^)), . 

We now define two hidden Markov chains {Zj^ )k£n and (Yfe)fcgi^ by 

Z^^^ :=vr((i(^),x('^)),(ig,,xg,)), Y, := vr((i,,x,), (i,+i,x,+0)- 

That is, {Z^ )keN and (Yk)keN differ in their evolution only in their inital distributions. 
The crucial point now is the following proposition: 

Proposition 5.3. For a// (s^^), t^^)), . . . , (s("),i(")) eW^, 

P[Yi = (s«,tW),...,Y„ = (s("),t("))] =P[Yi = (s«,t«),...,Y„ = (sW,tW)]. 

Proof. We prove the claim by induction on n. First, let j,s E I and t^^^ = jm with 
2 < m < n{s,j), and let aQbQ,ab € A"^ with T(C(ao6o)) = s and T{C{ab)) = j. If Cj^m is 
the m-th cone of type j in the covering of C(aofeo) then there is unique xq = Xq '"'''"'" E A* 
with XQab G dCj^rn- With this notation we get:

P[Yi = (s,j„,),[W2]=a6] = Yl P[(ii,Wi) = (sfc,,,x)]-g(x,xoa6) 

(uk,i,x)&A!:u=s 

= ^ h{sk,hx)q[{sk,hx),{is^m,XQah)) 

(wj. i,x)(iV\>:u=s 

= P[Yi = (s,i„),[x2]=a6]. 

Now we turn to the case t^^> = ji. Once again, if Cj i is the first cone of type j in the 
covering of C{aobo) then there is unique xq = a^Q "'' '"" G A* with x^ab G dCj^i. We get: 

P[Yi = (s,Ji),[x2] = a6] 

= X] f^i{sk,hx) q{{sk,i,x),{ts,i,xoab)) + ^ q{{sk,i,x),{tp^g,yab)) 



(«(; i,x)gW:u=s 

^ fll{Sk,l,x) 

{ukj,x)eW:u=s 



(tp,,,yaibi)eW: 
p^s,aibi=ab 

l{{sk,i,x),{ts^i,xoab)) ^ 

#{tfc,z \k^s} + l ^ 

{tp,q,yaibi)eW: 
p^s,aibi=ab 



q{{sk,i,x),(ts,i,xoab)) ' 



Now, 



Y, F[(ii, Wi) = (sfc,,,x)] • q{x,Xoab) = P[Yi = (s,ii), [W2] = ab]. 

{uk,i,x)eW:u=s 

P[Yi = (.,tW)] = Y. ^P[Yi = i^^t^'^)^ [X2] = ab] 

abeA'^ 

= Y IP[Yi = (s«,t(i)),[W2]=a6] =P[Yi = (s«,t«)]. 

abGA'^ 

We now perform the induction step where we will use the equations from the initial step 
as induction assumptions. First, consider the case t("+^) = j„ with m > 2; then for all 
aobo,ab G A'^ with r(C(ao^o)) = s^"'~^^' =: s and T(C(a6)) = j there is unique xq = 
^|sj,"i,a G A* with xofife G dCj^ ^{a^bo). Since we have an underlying Markov chain we 
obtain: 

P[Yi = (5«,i«),...,Y„+i = (5("+^),i("+i)),[x„+i] = ao6o,[x„+2] =a6] 
= Y Y. P[Yi = (s('\t(')),...,Y„ = (n,Sfc),[x„+i] = ^oaofeo] 

(u,iJfc)g>V,r:«'oG.4* 

■q[{sk,WQaQbo), {js,m,xoab)) 
= P[Yi = (s«,i«),...,Y„ = (5("),i(")), [x„+i] = aobo]4^Haobo,xoab) 

= P[Yi = (s«,i«), . . . , Y„ = (s("),i(")), [W„+i] = aobo]^^Maobo,xoab) 
= P[Yi = (5«,i«),...,Y„+i = (s("+i),t("+i)),[W„+i] = ao6o,[W„+2] =a6]. 



14 LORENZ A. GILCH 

Now we turn to the case t("+^) = ji. Once again, if Cj^i is the first cone of type j in the 
covering of C(ao6o) (of type s) then there is unique xq = Xq "'' '"" G A* with XQab € dCj^i. 
We get by distinguishing whether t("+^) = ji arises from in+2 = js,i or in+2 = jk,l with 

F[Yi = (s(i),t(i)), . . . , Y„+i = (s("+^), ji), [x„+i] = aobo, [x„+2] = ab] 

(up^q,w)<=:W:lw]=aobo 

j{{up,q,w),{js,i,xoab)) + ^ q{{up,q,w),{jk,i,y)) 

(ifc,!,J/)6W: 
t=j,fc^s,[y]=afe 



'[Yi = (s«,t«),...,Y„ = (.W,tW),[x„+i] = 

^{ab) Ij{aobo,xoab) y—^ Cio-b) Ij{aobQ,xoab) 



1- E 



e(aobo) #{ifc,« I A; y^ s} + 1 ^ ^(0060) #{ifc,i | /e / s} + 1 

t=j,p^s,[y]=ab 

= P[Yi = (sW,tW),...,Y„ = (5("),t(")),[x„+i] = ao6o]-^^]L(ao6o,xoa6) 

Uaooo) 

U«oOo) 
= P[Yi = (s(i),t(i)), . . . , Y„+i = (s("+^), ji), [W„+i] = ao6o, [W„+2] = ab]. 

Finally, we obtain: 

P[Yi = (sW,tW),...,Y„+i = (.("+i),t("+i))] 

^ P[Yi = (s«,t«),...,Y„+i = (s("+i),t("+i)),[x„+i] = aofeo,[x„+2] = a6] 

5^ P[Yi = (5(^),t(^)), . . . , Y„+i = (s("+i),t("+i)), [W„+i] = aobo, [W„+2] = aft] 

= P[Yi = (s«,tW),...,Y„+i = (.("+y,t("+i))]. 
This finishes the proof. □

In other words, the lemma states the following: the process governed by $Q$ can be seen as an exit time process in which there are more subcones to enter (namely, the subcones with indices $j_{k,l}$, $k \neq i$, when currently in a cone of type $i$), but the projection $\pi$ folds the process down to the same hidden Markov chain $(Y_k)_{k\in\mathbb{N}}$, and it does not distinguish whether $\mathbf{i}_k = j_{i,l}$ or $\mathbf{i}_k = j_{m,n}$, $m \neq i$.

Hence, the Markov chains $((\mathbf{i}_k, W_k), (\mathbf{i}_{k+1}, W_{k+1}))_{k\in\mathbb{N}}$ and $((\mathbf{i}_k, x_k), (\mathbf{i}_{k+1}, x_{k+1}))_{k\in\mathbb{N}}$ lead to the same hidden Markov chain in terms of probability. The important difference is that the transition matrix $Q$ has strictly positive entries, while this need not hold for the transition matrix of the chain $((\mathbf{i}_k, W_k), (\mathbf{i}_{k+1}, W_{k+1}))_{k\in\mathbb{N}}$.
5.3. Entropy of the Hidden Markov Chain related to the Exit Time Process. In

this subsection we derive existence of the asymptotic entropy of the hidden Markov chains 
(Yfe)fcgN and (Yfc)^^^. 

First, consider the hidden Markov chain $(Z_k)_{k\in\mathbb{N}}$: this process is stationary and ergodic
since the underlying Markov chain (i^'' ; x^'^'^ ") is stationary and ergodic. Hence, there 
is a constant $H(Z) \ge 0$ such that

$$\lim_{k\to\infty} -\frac{1}{k}\log \mathbb{P}[Z_1 = s_1, \dots, Z_k = s_k] = H(Z)$$

for almost every realisation $(s_1, s_2, \dots) \in \mathcal{W}_\pi^{\mathbb{N}}$ of $(Z_k)_{k\in\mathbb{N}}$; see e.g. Cover and Thomas [U Theorem 16.8.1]. We now deduce the same property for the process $(Y_k)_{k\in\mathbb{N}}$.



Proposition 5.4. For almost every realisation $(s_1, s_2, \dots) \in \mathcal{W}_\pi^{\mathbb{N}}$ of $(Y_k)_{k\in\mathbb{N}}$,

$$\lim_{k\to\infty} -\frac{1}{k}\log \mathbb{P}[Y_1 = s_1, \dots, Y_k = s_k] = H(Z).$$



Proof. The processes $(Z_k)_{k\in\mathbb{N}}$ and $(Y_k)_{k\in\mathbb{N}}$ differ only in the initial distribution. Moreover, there are constants $c, C > 0$ such that

C • fj'l(}m,ni X) < v(}m.,ni x) < G ■ ill[im,m ^J 

for all {im,n,x) € W. We now get for almost every trajectory (s;^, £2, . . . ) G W^ of (Yk)ke'N' 
lim -ilogP[Yi=Si,...,Yfc = S;,] 
= J™ "I^^S Y^ lP[(ii,xi) = y^,...,(ifc+i,Xfc+i) = y ] 



lim -ilogP[Zi = Si,...,Zfc = Sfc] =/^(Z). 

K — ^00 fC 



As a consequence we obtain the next statement:

Corollary 5.5.

$$\lim_{k\to\infty} -\frac{1}{k}\int \log \mathbb{P}[Y_1 = s_1, \dots, Y_k = s_k]\, d\mathbb{P}(s_1, s_2, \dots) = H(Z).$$
Proof. Since $|\mathcal{W}| < \infty$ by construction, there is $\varepsilon_0 > 0$ such that

$$1 \ge Q(y_1, y_2) \ge \varepsilon_0 \quad \text{for all } y_1, y_2 \in \mathcal{W}.$$

Hence,

$$0 \le -\frac{1}{k}\log \mathbb{P}[Y_1 = s_1, \dots, Y_k = s_k] \le -\frac{1}{k}\log c - \log \varepsilon_0,$$

where $c = \min\{\mu_1(y) \mid y \in \mathcal{W}\}$. Therefore, we may exchange integral and limit, which yields the claim. □

Let $x = x_1 \dots x_n \in \mathcal{L}$, $n = |x| \ge 2$, be on the boundary of some cone $C$. Define
l{xi . . . Xn) := - log ^ L(o, xi ...x„_26c). 

bc£A'^.xi...Xn-2bcSdC 

Proposition 5.6. 

i(x ) 

lim — -^^ = H{Zi) almost surely. 

Proof. Assume that Wj = yjUjbj, where yj S A* and Ojbj S A"^ with < j < k. That 
is, Xg. = yiy2 • • ■UjO'jbj- By definition, Xg. is on the boundary of some cone, which we 
denote by Cj. 

Let j G I. Recall that the covering of C consists of uq subcones. Each of these subcones 

C^ has again a covering consisting of rij subcones of type j. Write Nj := Y17=i '^^j ^^'^ ^^ 

denote by C- ^ these different subcones with 1 < k < Nj. Furthermore, we write w ~ yab 

if $w = ycd \in \partial C(yab)$ for $y \in A^*$ and $ab, cd \in A^2$, that is, $w \sim yab$ if $w$ lies on the boundary of the same cone as $yab$ (namely the cone $C(yab)$).

Moreover, we have for all j Gl and wiab,W2ab € IJj=i ^^7 fc *^^^* IPi-'^ei = wiab] > if 
and only if $\mathbb{P}[X_{e_1} = w_2ab] > 0$. Therefore, there are $c, C > 0$ such that

$$c \cdot \mathbb{P}[X_{e_1} = w_2ab] \le \mathbb{P}[X_{e_1} = w_1ab] \le C \cdot \mathbb{P}[X_{e_1} = w_2ab]$$

for all wiab,W2ab € Ui=i "^C* fc ■ Assume now that r(C(ai6i)) = j £ X. Observe that 

-.(1) 



dC{yoyiaibi) = {yoyicidi, ..., yoUiCKd,.,} implies that C) I has the form {wcidi, ..., wCk^k} 



for some suitable w € A*. We have: 



iVj • ^ ^ F[Xe-,='Wi]q{yi[wi],W2) ■q{w2,W3) ■ ...-q^Wk-i^Wk) 



■wiGC: ui2,...,uij,GVVo: 
■i^l^i/OJ/iaifci Wi^yiaibi 



g(M) - 

e(K])' 

C-P[Yi = (i,i«),...,Yfc_i = (sfc_i,t('=-i))], (5.1) 



fe=l «iiG£: i02,.--,«'fc£VVo: 
wi^yoyiaibi Wi^yiaibi 

< C-^ ^ ^ "F[Xe^ = Wi]-jj^^l.{wi,W2) ■ q{w2,W'i) ■ . . . ■ q{Wk-l,Wk) 
where the values of $s_2, \dots, s_{k-1}$ and $t^{(1)}, \dots, t^{(k-1)}$ are determined by the values of $W_j = y_ja_jb_j$. Analogously,

^i ' "12 X] ^Xe^ =wi]q{yi[wi],W2) ■q{w2,W3)- ...■q{wk-i,Wk) 



wi^yoyiaibi Wi^yiaibi 



> c-P[Yi = (j,t«),...,Yfc„i = (sfe_i,t(*^-i))]. (5.2) 

Recall that $G(o,w) = G(o,o)L(o,w)$ for all $w \in \mathcal{L}$ and that $\xi(\cdot)$ can only take finitely many values. Writing $X_{e_k} = x_1 \dots x_n$ and $j = \tau(C(X_{e_1}))$, we now can conclude as follows:



fc— 5>oo k 

lim — — lo^ 

fc— 5>oo k 



lim — — lo^ 

fc— 5>oo k 



lim — — lo^ 

fc— 5>oo A; 



y^ L{o,xi...Xk-2bc) 



bceA'^:xi...Xn-2bcGdCk 



X] X] -J^(o,w;i)L(u;i,'u;2) • ••• •lL(^«fc-i,Wfc) 



wir^yoyiaibi Wir^yiatbi 



E 



J^ G(o,wi)«[u.,])^|nMi(«.i,»2; 



■^i^J/OJ/iaifei Wi-^yiaibi 



e(K])" 



e(K]) 
e([«^fc-i]) 

lim — — log 

fc— 5>oo K 



lim — — lo^ 

A;->oo k 



L{wk-i,Wk) 



y^ y^ P[Xei =-u;i]g'(yi[i(;i],'u;2) • ■■■ • ^(iffe-ijW^fc) 



wi^yoyiciibi Wir^yiaibi 



^j X] X ^Xe^=wi]q{yi['Wi],W2)-...-qiwk-i,Wk) 






= hm -ilogP[Yi = (j,t«),...,Yfc_i = (sfc_i,i(*^-i)] =/?(Z). 

The last equation follows from (5.1) and (5.2). We need those important estimates since the first coordinate of $Y_1$ describes only the cone type of $X_{e_1}$ but there may be several cones of the same type $j = \tau(C(X_{e_1}))$. □

Recall the definition $l(x_1 \dots x_n) = -\log L(o, x_1 \dots x_n)$ for $x = x_1 \dots x_n \in \mathcal{L}$.

Corollary 5.7.

$$\lim_{k\to\infty} \frac{l(X_{e_k})}{k} = H(Z) \quad \text{almost surely.}$$

Proof. It suffices to compare ^(Xej.) with l{Xei^). Assume for a moment that Xe,. = xi . . . x„ 
and that X^^ is on the boundary of the cone C. Then, the probability of walking inside 
$C$ from $x_1 \dots x_n \in \partial C$ to any $x_1 \dots x_{n-2}ab \in \partial C$ can be bounded from below by some constant $\varepsilon_0$, because the probabilities depend only on $x_{n-1}x_n$ and $ab \in A^2$. Therefore,

L(o,XeJ-eo < X] L{xi...Xm-2ah), 

hcGA'^:xi...x„-20-b(:dC 
y~^ L(o,Xi ...X„_2a&) -eo < l-^^l • ^(Oj-'^eJ- 

Taking logarithms, dividing by $k$ and letting $k$ tend to infinity yields the claim. □

Now we come to an important law of large numbers. For this purpose, define $d(x, y) = |y| - |x|$ for $x, y \in \mathcal{L}$ with $|x| \le |y|$, where $|\cdot|$ is the natural word length. Denote by $\nu_0$ the invariant probability measure of $(W_k)_{k\in\mathbb{N}}$ and define

$$\lambda := \sum_{x,y \in \mathcal{W}_0} \nu_0(x) \cdot q(x,y) \cdot d(x,y). \tag{5.3}$$

Then:

Proposition 5.8.

$$\lim_{n\to\infty} \frac{l(X_n)}{n} = \ell \cdot \lambda^{-1} \cdot H(Z) \quad \text{almost surely.}$$

Proof. Define 

e^ := supjrTi G N |Xm| = A;}. 

Transience yields e^ < oo almost surely for all k G N. Define the maximal exit times at 
time n G N as 

k(n) := max{fc G N | e^ < n}, 
t(n) := maxjfc G N | e^ < n}. 

Obviously, k(n) > t(n) and each exit time e^ corresponds to exactly one e^ with I > k. 
First, we rewrite 

IjXn) ^ l{Xn) - IjX^^fJ ^ KX,,^J ^ t(n[ ^ kH _ ek(^ .^^. 

n n t(?i) k(n) ek(„) n 

Let £i be the minimal occuring positive single-step transition probability. Since the sub- 
cones of coverings of bigger cones are nested at bounded distance we have e\r_t^\ > et(„) > 
^k(n)-D for some suitable D G N. The first quotient on the right hand side of (15. 4p tends 
to zero since 

L{o,Xn)-£i < -L(o, Xe^,, ) (due to weak symmetry), 

L(o,Xe,(„,)er'*'"' < L{o,Xn) 
and due to (follows completely analogously as in \1G\ Proof of Theorem D]) 

— ^—^ < ^—'- > 1, — ^—^ > ^— > 1, 

n n n n 

which in turn yields (n — e^/^))/?! ^> as n — t- oo. 
By Corollary 5.7, $l(X_{e_{t(n)}})/t(n)$ tends to $H(Z)$. On the other hand, $e_k/k$ tends almost surely to $1/\ell$ and $e_{k(n)}/n$ tends to $1$ almost surely; see [H Proposition 2.3]. It remains to prove that the limit $\lim_{n\to\infty} k(n)/t(n)$ exists. Clearly,

kf 1 1 *^"''~^ 

t(S = IM ^"^^"^ ^^^^ ^ + ^ '^^^^^ ' ^^'^^ ^ + '^^^^'<"' ' ^'^^- 

i=l 

Note that 

for a suitable constants Di and D2- Thus, it is sufficient to consider 

k 
j=l 

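Since $d(X_{e_j}, X_{e_{j+1}})$ can be computed from $W_j$ and $W_{j+1}$, we may apply the ergodic theorem for positive recurrent Markov chains on the process $((W_j, W_{j+1}))_{j\in\mathbb{N}}$, which yields almost surely

$$\frac{1}{k}\sum_{j=1}^{k} d(X_{e_j}, X_{e_{j+1}}) \xrightarrow{k\to\infty} \sum_{x,y \in \mathcal{W}_0} \nu_0(x)\, q(x,y)\, d(x,y) = \lambda.$$

This finishes the proof and gives the proposed formula. □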
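The constant in (5.3) is a stationary average of a one-step functional of a finite Markov chain, so it can be evaluated numerically once the transition probabilities are known. The following is a minimal sketch under stated assumptions: the matrix `Q_TOY` and the increments `D_TOY` are hypothetical toy data (not taken from the random walk of this paper), and the invariant measure is approximated by power iteration, which presumes a strictly positive transition matrix.

```python
import random

# Hypothetical toy data: a strictly positive 3x3 transition matrix q(x, y)
# and integer increments d(x, y). Neither comes from the paper.
Q_TOY = [[0.2, 0.5, 0.3],
         [0.4, 0.4, 0.2],
         [0.3, 0.3, 0.4]]
D_TOY = [[1, 2, 1],
         [1, 1, 3],
         [2, 1, 1]]


def stationary_distribution(q, iterations=5000):
    """Approximate the invariant probability vector by power iteration."""
    n = len(q)
    nu = [1.0 / n] * n
    for _ in range(iterations):
        nu = [sum(nu[x] * q[x][y] for x in range(n)) for y in range(n)]
    return nu


def drift_constant(q, d):
    """Evaluate sum over x, y of nu(x) * q(x, y) * d(x, y), the analogue of (5.3)."""
    nu = stationary_distribution(q)
    n = len(q)
    return sum(nu[x] * q[x][y] * d[x][y] for x in range(n) for y in range(n))


def ergodic_average(q, d, steps, seed=0):
    """Simulate the chain and average d over consecutive pairs of states."""
    rng = random.Random(seed)
    n = len(q)
    state, total = 0, 0.0
    for _ in range(steps):
        nxt = rng.choices(range(n), weights=q[state])[0]
        total += d[state][nxt]
        state = nxt
    return total / steps
```

Comparing `drift_constant(Q_TOY, D_TOY)` with `ergodic_average(Q_TOY, D_TOY, 100000)` illustrates the ergodic theorem used in the proof: the empirical average along one trajectory approaches the stationary sum.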

6. Existence of Entropy 

We follow the reasoning of for the proof of existence of the entropy. First, we need the 
following lemma: 

Lemma 6.1. There is $R > 1$ such that $G(o,w|R) < \infty$ for all $w \in \mathcal{L}$.

Proof. A simple adaptation of the proof of [12, Proposition 8.2] shows that $G(v,w|z)$ has radius of convergence $R(v,w) > 1$. At this point we need suffix-irreducibility. With the help of this fact we are able to prove the lemma in several steps:

(1) There is $R_0 > 1$ such that $L(o, abc|R_0) < \infty$ for all $abc \in A^3$: this follows from the inequality $G(o,w|z) \ge L(o,w|z)$.

(2) There is i?i > 1 such that G{ab,cd\Ri) < oo for all ab,cd G A'^: this follows from 
the inequality G{ab,cd\z) > G{ab,cd\z). 

(3) Since for a, b,c,d,e & A 

L{ab,cde\z) = y^ p{ab,cdiei) ■ z ■ G{diei,de\z), 

we have L{ab,cde\Ri) < oo. 

(4) By $G(o,w|z) = G(o,o|z)L(o,w|z)$ and Equation (3.4), we get $G(o,w|R) < \infty$ for all $w \in \mathcal{L}$, where $R = \min\{R(o,o), R_0, R_1\} > 1$.

This finishes the proof. □
In the following let $\varrho \in [1, R)$.

Lemma 6.2. There are constants $D_1$ and $D_2 > 0$ such that for all $m, n \in \mathbb{N}_0$

$$p^{(m)}(o, X_n) \le D_1 \cdot D_2^n \cdot \varrho^{-m}.$$

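Proof. Denote by $C_\varrho$ the circle with radius $\varrho$ in the complex plane centered at $0$. A straightforward computation shows for $m \in \mathbb{N}_0$:

$$\frac{1}{2\pi i}\oint_{C_\varrho} z^{m}\,\frac{dz}{z} = \begin{cases} 1, & \text{if } m = 0,\\ 0, & \text{if } m \neq 0. \end{cases}$$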

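Let $w = w_1 \dots w_t \in \mathcal{L}$. An application of Fubini's Theorem yields

$$\frac{1}{2\pi i}\oint_{C_\varrho} G(o,w|z)\, z^{-m}\,\frac{dz}{z} = \frac{1}{2\pi i}\oint_{C_\varrho} \sum_{k\ge 0} p^{(k)}(o,w)\, z^{k} z^{-m}\,\frac{dz}{z} = \frac{1}{2\pi i}\sum_{k\ge 0} p^{(k)}(o,w) \oint_{C_\varrho} z^{k-m}\,\frac{dz}{z} = p^{(m)}(o,w).$$

Since $G(o,w|z)$ is analytic on $C_\varrho$, we have $|G(o,w|z)| \le G(o,w|\varrho)$ for all $|z| = \varrho$. Thus,

$$p^{(m)}(o,w) \le \frac{1}{2\pi} \cdot \varrho^{-m-1} \cdot G(o,w|\varrho) \cdot 2\pi\varrho = G(o,w|\varrho) \cdot \varrho^{-m}.$$

Set $L := 1 \vee \max\{L(ab, cde|\varrho) \mid a,b,c,d,e \in A\}$ and $D_0 := G(o,o|\varrho) \cdot \sum_{abc \in A^3} L(o, abc|\varrho)$. An application of Equation (3.4) provides for $t \ge 3$

$$G(o,w|\varrho) = G(o,o|\varrho) \cdot L(o, w_1 \dots w_t|\varrho) \le D_0 \cdot |A|^{2(t-3)} \cdot L^{t-3}.$$

Set $D_1 := D_0 \vee \max\{G(o,w|\varrho) \mid w \in \mathcal{L}, |w| \le 2\}$. Since $|X_n| \le n$, we obtain by setting $D_2 := |A|^2 \cdot L$

$$p^{(m)}(o, X_n) \le D_1 \cdot |A|^{2t} L^{t} \cdot \varrho^{-m} \le D_1 \cdot |A|^{2n} L^{n} \cdot \varrho^{-m} = D_1 \cdot D_2^{n} \cdot \varrho^{-m}.$$

□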
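The coefficient extraction by a contour integral and the resulting bound $p^{(m)}(o,w) \le G(o,w|\varrho)\cdot\varrho^{-m}$ can be checked numerically on a toy power series. The sketch below is only an illustration under assumptions: the function `toy_green` is a hypothetical generating function $1/(2-z) = \sum_k 2^{-(k+1)} z^k$ with radius of convergence $2$, chosen because its coefficients are known in closed form; it is not a Green function of the random walk studied here. The integral over the circle is approximated by an equally spaced Riemann sum.

```python
import cmath
import math


def toy_green(z):
    """Hypothetical generating function 1/(2 - z) = sum_k 2^-(k+1) z^k."""
    return 1.0 / (2.0 - z)


def coefficient(m, rho, points=256):
    """Approximate (1 / (2 pi i)) * contour integral of G(z) z^(-m) dz / z on |z| = rho.

    With z = rho * exp(i theta) we have dz / z = i dtheta, so the integral
    becomes the average of G(z) * z^(-m) over equally spaced angles.
    """
    total = 0j
    for j in range(points):
        z = rho * cmath.exp(2j * math.pi * j / points)
        total += toy_green(z) * z ** (-m)
    return (total / points).real


def cauchy_bound(m, rho):
    """Right-hand side G(rho) * rho^(-m) of the coefficient estimate."""
    return toy_green(rho) * rho ** (-m)
```

For any radius `rho` in `[1, 2)` the value `coefficient(m, rho)` should reproduce `2 ** -(m + 1)` and stay below `cauchy_bound(m, rho)`, which decays geometrically in `m` as soon as `rho > 1`.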

The following technical lemma will be used in the proof of the next theorem: 

Lemma 6.3. Let $(A_n)_{n\in\mathbb{N}}$, $(a_n)_{n\in\mathbb{N}}$, $(b_n)_{n\in\mathbb{N}}$ be sequences of strictly positive numbers with $A_n = a_n + b_n$. Assume that $\lim_{n\to\infty} -\frac{1}{n}\log A_n = c \in [0,\infty)$ and that $\lim_{n\to\infty} b_n/q^n = 0$ for all $q \in (0,1)$. Then $\lim_{n\to\infty} -\frac{1}{n}\log a_n = c$.

Proof. A proof can be found in [3, Lemma 3.5]. □
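A small numerical illustration of Lemma 6.3 may help to see why the superexponentially small part $b_n$ does not influence the exponential rate. The sequences below are hypothetical choices made only for this sketch, namely $a_n = e^{-cn}$ and $b_n = e^{-n^2}$; all computations are done with logarithms to avoid floating-point underflow.

```python
import math


def log_add(x, y):
    """Return log(exp(x) + exp(y)) in a numerically stable way."""
    hi, lo = max(x, y), min(x, y)
    return hi + math.log1p(math.exp(lo - hi))


def decay_rates(n, c=0.7):
    """Return (-log(A_n)/n, -log(a_n)/n) for a_n = exp(-c n), b_n = exp(-n^2)."""
    log_a = -c * n
    log_b = -float(n * n)
    log_big_a = log_add(log_a, log_b)  # A_n = a_n + b_n
    return -log_big_a / n, -log_a / n


def log_ratio(n, q):
    """Return log(b_n / q^n) for b_n = exp(-n^2); it tends to -infinity for every q in (0, 1)."""
    return -float(n * n) - n * math.log(q)
```

For small `n` the two rates returned by `decay_rates` differ because $b_n$ is still comparable to $a_n$; for larger `n` they agree up to rounding, as the lemma predicts.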

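Lemma 6.4. For $n \in \mathbb{N}$, consider the function $f_n : \mathcal{L} \to \mathbb{R}$ defined by

$$f_n(w) := \begin{cases} -\frac{1}{n}\log \sum_{m=0}^{n^2} p^{(m)}(o,w), & \text{if } p^{(n)}(o,w) > 0,\\ 0, & \text{otherwise.} \end{cases}$$

Then there are constants $d$ and $D$ such that $d \le f_n(w) \le D$ for all $n \in \mathbb{N}$ and $w \in \mathcal{L}$.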
Proof. Let $w \in \mathcal{L}$ and $n \in \mathbb{N}$ with $p^{(n)}(o,w) > 0$. Denote by $R$ the radius of convergence of $G(w,w|z)$. By Inequality (3.1), we get

^p^'^\o,w) <G(o,u;|l) =F{o,w\l)-G{w,w\l) < —^, 
that is, 

fn{w) > -- log Y ^ - log -, r- 

For the upper bound, observe that $w \in \mathcal{L}$ with $p^{(n)}(o,w) > 0$ can be reached from $o$ in $n$ steps with a probability of at least $\varepsilon_0^n$, where

$$\varepsilon_0 := \min\{p(w_1,w_2) \mid w_1, w_2 \in A^*, p(w_1,w_2) > 0\} > 0$$

is independent of $w$. Thus, the sum $\sum_{m=0}^{n^2} p^{(m)}(o,w)$ has a value greater than or equal to $\varepsilon_0^n$. Hence, $f_n(w) \le -\log \varepsilon_0$. □

Now we can finally prove: 

Theorem 6.5. The asymptotic entropy $h$ of $(X_n)_{n\in\mathbb{N}_0}$ exists and equals $h = \ell \cdot \lambda^{-1} \cdot H(Z)$.

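Proof. We can rewrite $\ell \cdot \lambda^{-1} \cdot H(Z)$ as

$$\begin{aligned}
\ell \cdot \lambda^{-1} \cdot H(Z) &= \int \ell \cdot \lambda^{-1} \cdot H(Z)\, d\mathbb{P} = \int \lim_{n\to\infty} \frac{1}{n}\, l(X_n)\, d\mathbb{P} \\
&= \int \lim_{n\to\infty} -\frac{1}{n} \log L(o, X_n(\omega)|1)\, d\mathbb{P}(\omega) \\
&= \int \lim_{n\to\infty} -\frac{1}{n} \log \frac{G(o, X_n(\omega)|1)}{G(o,o|1)}\, d\mathbb{P}(\omega) = \int \lim_{n\to\infty} -\frac{1}{n} \log G(o, X_n(\omega)|1)\, d\mathbb{P}(\omega).
\end{aligned}$$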

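Since

$$G(o, X_n|1) = \sum_{m\ge 0} p^{(m)}(o, X_n) \ge p^{(n)}(o, X_n) = \pi_n(X_n),$$

we have

$$\ell \cdot \lambda^{-1} \cdot H(Z) \le \int \liminf_{n\to\infty} -\frac{1}{n} \log \pi_n(X_n(\omega))\, d\mathbb{P}(\omega). \tag{6.1}$$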

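The next aim is to prove $\limsup_{n\to\infty} -\frac{1}{n}\mathbb{E}[\log \pi_n(X_n)] \le h$. We now apply Lemma 6.3 by setting

$$A_n := \sum_{m\ge 0} p^{(m)}(o, X_n), \quad a_n := \sum_{m=0}^{n^2} p^{(m)}(o, X_n) \quad \text{and} \quad b_n := \sum_{m\ge n^2+1} p^{(m)}(o, X_n).$$

By Lemma 6.2,

$$b_n \le \sum_{m\ge n^2+1} D_1 \cdot D_2^n \cdot \varrho^{-m} = D_1 \cdot D_2^n \cdot \frac{\varrho^{-n^2-1}}{1 - \varrho^{-1}}.$$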
Therefore, $b_n$ decays faster than any geometric sequence. Applying Lemma 6.3 yields

$$\ell \cdot \lambda^{-1} \cdot H(Z) = \lim_{n\to\infty} -\frac{1}{n} \log \sum_{m=0}^{n^2} p^{(m)}(o, X_n).$$

By Lemma 6.4 we may apply the Dominated Convergence Theorem and get:

$$\begin{aligned}
\ell \cdot \lambda^{-1} \cdot H(Z) &= \int \ell \cdot \lambda^{-1} \cdot H(Z)\, d\mathbb{P} = \int \lim_{n\to\infty} -\frac{1}{n} \log \sum_{m=0}^{n^2} p^{(m)}(o, X_n)\, d\mathbb{P} \\
&= \lim_{n\to\infty} \int -\frac{1}{n} \log \sum_{m=0}^{n^2} p^{(m)}(o, X_n)\, d\mathbb{P} \\
&= \lim_{n\to\infty} -\frac{1}{n} \sum_{w\in\mathcal{L}} p^{(n)}(o,w) \log \sum_{m=0}^{n^2} p^{(m)}(o,w).
\end{aligned}$$

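For $w \in \mathcal{L}$, set

$$\mu(w) := \frac{1}{n^2+1} \sum_{m=0}^{n^2} p^{(m)}(o,w).$$

Recall that Shannon's Inequality gives

$$-\sum_{w\in\mathcal{L}} p^{(n)}(o,w) \log \mu(w) \ge -\sum_{w\in\mathcal{L}} p^{(n)}(o,w) \log p^{(n)}(o,w)$$

for every finitely supported probability measure $\mu$ on $\mathcal{L}$. We now apply this inequality to the measure $\mu$ defined above:

$$\begin{aligned}
\ell \cdot \lambda^{-1} \cdot H(Z) &\ge \limsup_{n\to\infty} \Big[ -\frac{1}{n} \sum_{w\in\mathcal{L}} p^{(n)}(o,w) \log(n^2+1) - \frac{1}{n} \sum_{w\in\mathcal{L}} p^{(n)}(o,w) \log p^{(n)}(o,w) \Big] \\
&= \limsup_{n\to\infty} -\frac{1}{n} \int \log \pi_n(X_n)\, d\mathbb{P}.
\end{aligned}$$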

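Now we can conclude with Fatou's Lemma:

$$\begin{aligned}
h = \ell \cdot \lambda^{-1} \cdot H(Z) &\le \int \liminf_{n\to\infty} \frac{-\log \pi_n(X_n)}{n}\, d\mathbb{P} \le \liminf_{n\to\infty} \int \frac{-\log \pi_n(X_n)}{n}\, d\mathbb{P} \\
&\le \limsup_{n\to\infty} \int \frac{-\log \pi_n(X_n)}{n}\, d\mathbb{P} \le \ell \cdot \lambda^{-1} \cdot H(Z) = h.
\end{aligned} \tag{6.2}$$

Thus, $h = \lim_{n\to\infty} -\frac{1}{n}\mathbb{E}[\log \pi_n(X_n)]$ exists and the limit equals $\ell \cdot \lambda^{-1} \cdot H(Z)$. □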
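The step using Shannon's Inequality compares the $n$-step distribution with an average of several distributions that contains it. The following sketch checks both directions of this comparison numerically. It is an illustration under assumptions only: the distributions are randomly generated toy data on a six-point set, `averaged_measure` plays the role of $\mu$, and the number of averaged distributions stands in for $n^2+1$.

```python
import math
import random


def entropy(p):
    """Shannon entropy -sum p log p (natural logarithm)."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def cross_entropy(p, mu):
    """Return -sum p log mu; requires mu > 0 wherever p > 0."""
    return -sum(x * math.log(m) for x, m in zip(p, mu) if x > 0.0)


def random_distribution(size, rng):
    """Hypothetical strictly positive probability vector of the given size."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def averaged_measure(distributions):
    """Arithmetic mean of several probability vectors; again a probability vector."""
    count = len(distributions)
    return [sum(column) / count for column in zip(*distributions)]
```

If `p` is one of `k` averaged distributions, then `cross_entropy(p, mu)` lies between `entropy(p)` and `entropy(p) + log(k)`. The lower bound is Shannon's Inequality; the upper bound follows from `mu >= p / k` and corresponds to the term $\log(n^2+1)$, which vanishes after division by $n$.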

We get the following types of convergence: 

Corollary 6.6. (1) For almost every path of the random walk $(X_n)_{n\in\mathbb{N}_0}$,

$$h = \liminf_{n\to\infty} \frac{-\log \pi_n(X_n)}{n}.$$

(2) Convergence in probability:

$$-\frac{1}{n}\log \pi_n(X_n) \xrightarrow{\mathbb{P}} h.$$

(3) Convergence in $L_1$:

$$-\frac{1}{n}\log \pi_n(X_n) \xrightarrow{L_1} h.$$


Proof. The proofs are completely analogous to the proofs in [7, Corollary 3.9, Lemma 3.10], where [7, Lemma 3.10] holds also in the case $h = 0$. □

Corollary 6.7. The entropy is the rate of escape with respect to the Greenian distance, that is,

$$h = \lim_{n\to\infty} -\frac{1}{n} \int \log G(o, X_n|1)\, d\mathbb{P}.$$

Proof. This follows from the simple fact G(o, X„|l) = G(o, o|l)L(o, X„|l) and Proposition 

7. Calculation of the Entropy

In order to compute $h = \ell \cdot \lambda^{-1} \cdot H(Z)$ we have to calculate the three factors: while there is a formula for $\ell$ (given in [6, Theorem 2.4]) and there is also a formula for $\lambda$ (given in (5.3)), it remains to explain how to calculate $H(Z)$. For this purpose, define

$$H(Z_1, \dots, Z_n) := -\sum_{s_1,\dots,s_n \in \mathcal{W}_\pi} \mathbb{P}[Z_1 = s_1, \dots, Z_n = s_n] \log \mathbb{P}[Z_1 = s_1, \dots, Z_n = s_n],$$

and let the conditional entropy $H(Z_n \mid Z_1, \dots, Z_{n-1})$ be defined as

$$-\sum_{s_1,\dots,s_n \in \mathcal{W}_\pi} \mathbb{P}[Z_1 = s_1, \dots, Z_n = s_n] \log \mathbb{P}[Z_n = s_n \mid Z_1 = s_1, \dots, Z_{n-1} = s_{n-1}].$$

By [H Theorem 4.2.1], $H(Z) = \lim_{n\to\infty} \frac{1}{n} H(Z_1, \dots, Z_n)$. In general, the computation of $H(Z)$ is a hard task. But there is a simple way to calculate $H(Z)$ numerically, which is due to the inequalities

H{Zn\{{l[''\4"^), (i5"\x("^)), Zi, . . .,Zn-l) < H{Z) < H{Zn | Zi, . . . , Z„_i) (7.1) 
for all n G N; see [H Theorem 4.5.1]. In particular, it is even shown that 

i7(Z„|Zi,...,Z„_i)-//(Z„|((i('^\xi'^)),(i(^\x('^))),Zi,...,Z„_i)^^^^0. 
Hence, one can calculate $H(Z)$ numerically up to an arbitrarily small error. Furthermore:

Corollary 7.1. If the random walk is expanding, then $h > 0$. Otherwise, $h = 0$.

Proof. In the expanding case, the random walk has at least two possibilities for entering a 
subcone decsribed by Xg,, for every given value of X^^ ■ Thus, 

H{Z)>H{Z,\{(l[^\4^\(li^\4^^)),Z,)>0, 
which yields $h > 0$ due to (7.1). On the other hand, if the random walk on $\mathcal{L}$ is not expanding, then each cone has a covering consisting of only one single cone. Then the projections $Z_n$ become deterministic and this implies

$$0 \le H(Z) \le H(Z_n \mid Z_1, \dots, Z_{n-1}) = H(Z_n) = 0.$$

□
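The two-sided bounds in (7.1) suggest a direct numerical scheme for the entropy rate of a hidden Markov chain: compute the conditional entropy with and without the initial hidden state and watch the two values close in on each other. The sketch below illustrates this on a hypothetical three-state chain whose observation map `PHI` merges two states; the matrix `P`, the map `PHI` and all function names are assumptions made for this example and are not the chain $(Z_k)$ of the paper. Block probabilities are obtained with the standard forward recursion and entropies by brute-force enumeration, so it is only practical for small `n`.

```python
import itertools
import math

# Hypothetical stationary Markov chain on states 0, 1, 2 (not from the paper).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
PHI = [0, 0, 1]  # observation map: states 0 and 1 are indistinguishable
STATES = range(len(P))
SYMBOLS = sorted(set(PHI))


def stationary(iterations=5000):
    """Approximate the invariant distribution of P by power iteration."""
    mu = [1.0 / len(P)] * len(P)
    for _ in range(iterations):
        mu = [sum(mu[i] * P[i][j] for i in STATES) for j in STATES]
    return mu


def sequence_probability(initial, observed):
    """Forward recursion: probability of the observed symbols if X_1 ~ initial."""
    alpha = [initial[i] if PHI[i] == observed[0] else 0.0 for i in STATES]
    for symbol in observed[1:]:
        alpha = [sum(alpha[i] * P[i][j] for i in STATES) if PHI[j] == symbol else 0.0
                 for j in STATES]
    return sum(alpha)


def block_entropy(initial, n):
    """Entropy of the first n observations when X_1 ~ initial."""
    if n == 0:
        return 0.0
    total = 0.0
    for observed in itertools.product(SYMBOLS, repeat=n):
        prob = sequence_probability(initial, observed)
        if prob > 0.0:
            total -= prob * math.log(prob)
    return total


def entropy_rate_bounds(n):
    """Return (H(Y_n | X_1, Y_1..Y_{n-1}), H(Y_n | Y_1..Y_{n-1}))."""
    mu = stationary()
    upper = block_entropy(mu, n) - block_entropy(mu, n - 1)
    lower = 0.0
    for x in STATES:
        point_mass = [1.0 if i == x else 0.0 for i in STATES]
        lower += mu[x] * (block_entropy(point_mass, n) - block_entropy(point_mass, n - 1))
    return lower, upper
```

Calling `entropy_rate_bounds(n)` for increasing `n` produces a nondecreasing lower bound and a nonincreasing upper bound that enclose the entropy rate of the observed process, which is the mechanism behind the numerical procedure described above.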

We call $ab \in A^2$ unambiguous if $\partial C(ab) = \{ab\}$. In other words, whenever the random walk enters a subcone of type $C(wab)$, $w \in A^*$, it must enter it through its single boundary point $wab$. This allows us to "cut" the random walk into pieces and to obtain another formula for the entropy $H(Z)$. For $n \in \mathbb{N}$, $x_2, \dots, x_n \in \mathcal{W}_0$ and $ab \in A^2$ define

w{ab,X2,. . . ,Xn,x) := P[W2 = X2, • • • , W„ = x, [W„] = a6|Wi = aft] , 

w{ab,X2,...,Xn) ■■= ^ F[W2 = y2, ■■■, Wn = Vn, [^n] = ab\Wi = ab], 

where ~ is the relation introduced in the proof of Proposition [521 In particular, y{ab, X2) = 
P[W2 = X2, [W2] = a6 Wi = ab\ . Denote by z/i the invariant probability measure of the 
process (i^, Wfc)fc(=pj and set, for unambiguous ab € A^ , 

uw{ab) := ^ i^i{im,n,x). 

{im.n ,x)eyV:lx]=ab 

Then: 

Proposition 7.2. If ab € A^ is unambiguous, then 

h{Z) = -i^wiab)'^ ^ ^ w{ab,X2,...,Xn)logw{ab,X2,...,Xn). 

n>l X2,...Xn-i&Wo: x„&)A^o-lxn]=ab 
[xi]j^ab 

Proof. By Propositions 5.3 and 5.4 we have that

$$-\frac{1}{n} \log \mathbb{P}[Y_1 = s_1, \dots, Y_n = s_n] \xrightarrow{n\to\infty} H(Z)$$

for almost every realisation (S]^, £2, . . . ) € W^. Observe that r(W„+i) = ab is equivalent 
to Y„ = {tn,Oitn,m) for some cone type i„ G I, where a denotes the cone type of C{ab) 
and 1 <m < n{tn, a)- For any such trajectory, we define 

No := min{m G N|T(Wm+i) = a} and Nk := min{m G N|m > iVfc_i, T(Wm+i) = a}. 

For any realisation (s;^,S2;---) ^ ^w ^^^ ra G N, denote by d{n) the maximal index k 
with Nk < n. Since [Wtv^+i] = ab for all A; G N we can use the strong Markov property as 
follows when Nj < n: 

P [Yat^+i = s^^.+i, . . . , Y„ = s„ I Yi = Si, . . . , Yn, = sjvj 
= P[Y7v^,+i = sj^.+i, . . . , Y„ = s„ I [Wat^+i] = ab] 

= P[Yjv^,+i = S^r +1, . . . , Y„ = S„ I Yn, = SjvJ . 
Therefore, we can rewrite the following probability:

IP[Yl =ll,---,Y^(n) = Sd(n)] 

= P[Yi = Si, . . .,Yno = sno]^[Yno+i = sno+1, ■ ■ ■,Yni = sn, \ Yno = IatJ • . . . • 

■^P'iYAfdlnj-l + l = ^7Vd(„)_l + l' • • • ' YiVdCn) = ^Afd(„) I Y7V^(n)-i = SAfd(„)-l] • 

Obviously, $d(n)/n$ tends almost surely to $\nu_W(ab)$. Hence, if we consider only the subsequence where $n$ equals one of the $N_k$'s we obtain the following almost sure convergence:

logP[Yi = Si,...,Yrf(„) = Srf(„)] 

d{n) 1 



n d{n) 



logP[Yi = Si,...,Y;Vo=l 



■A^oJ 



+ log P \Yno+1 = SNo+1 , • • • , YtVi = S^J YjVo = S^Vo] + • • • + 

+ logP[Y;v,(„)_i+i = S7V,(„)_,+i, ■ ■ ■ , Y^,(,^j = s^ |Y;v,(„)_i = Siv, 



n— >oo 



> -i/w(a6)^ ^ ^ w{ah,X2,....,Xk-i,x)\ogw{ab,X2,...,Xk-i,x). 

k>l X2,...,Xk-ieyVo- xeWo: 
lxi]^ab lx]=ab 

This proves the claim. □

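We now state an inequality which connects entropy, drift and growth. For this purpose, define the growth of $A^*$ as $g := \log |A|$. Then we get:

Proposition 7.3. $h \le \ell \cdot g$.

Proof. Let $\varepsilon > 0$. By Corollary 6.6 (1), there is some $N_\varepsilon \in \mathbb{N}$ such that for all $n \ge N_\varepsilon$:

$$1 - \varepsilon \le \mathbb{P}\big[-\log \pi_n(X_n) \ge (h - \varepsilon)n,\ |X_n| \le (\ell + \varepsilon)n\big] \le e^{-(h-\varepsilon)n} \cdot |A|^{(\ell+\varepsilon)n}.$$

Taking logarithms and dividing by $n$ gives

$$(h - \varepsilon) + \frac{1}{n}\log(1 - \varepsilon) \le (\ell + \varepsilon) \cdot \log |A|.$$

Making $\varepsilon$ arbitrarily small yields the proposed claim. □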

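Finally, we remark that the entropy is zero for recurrent random walks:

Corollary 7.4. If $(X_n)_{n\in\mathbb{N}_0}$ is recurrent then $h = 0$.

Proof. Clearly, $-\frac{1}{n}\mathbb{E}[\log \pi_n(X_n)] \ge 0$. Assume now that $\limsup_{n\to\infty} -\frac{1}{n}\mathbb{E}[\log \pi_n(X_n)] = c > 0$. Then there is a deterministic sequence $(n_k)_{k\in\mathbb{N}}$ such that, for any small $\varepsilon_1 > 0$,

$$-\frac{1}{n_k}\mathbb{E}[\log \pi_{n_k}(X_{n_k})] > c - \varepsilon_1 > 0 \tag{7.2}$$

for all sufficiently large $k$. Denote by $p_0$ the minimal occurring positive single-step transition probability. Then $-\frac{1}{n_k}\log \pi_{n_k}(X_{n_k}) \le -\log p_0$. Moreover, choose $N \in \mathbb{N}$ with $1/N < c - \varepsilon_1$. Then there is some $\delta > 0$ with

$$\mathbb{P}\Big[-\frac{1}{n_k}\log \pi_{n_k}(X_{n_k}) \ge \frac{1}{N}\Big] \ge \delta \quad \forall k \in \mathbb{N} \text{ large enough.}$$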
To see this, assume that $\delta = \delta_k$ depends on $k$ with $\liminf_{k\to\infty} \delta_k = 0$, which leads to a contradiction to (7.2) since

$$(-\log p_0) \cdot \delta_k + (1 - \delta_k)\frac{1}{N} \ge -\frac{1}{n_k}\mathbb{E}[\log \pi_{n_k}(X_{n_k})] > c - \varepsilon_1.$$

If $\delta_k$ tends to zero then we get a contradiction to the choice of $N$.

Choose now $\varepsilon > 0$ arbitrarily small with $\varepsilon < \delta$. In the recurrent case we have $\ell = 0$. Then there is some index $K \in \mathbb{N}$ such that for all $k \ge K$:

$$\delta - \varepsilon \le \mathbb{P}\big[-\log \pi_{n_k}(X_{n_k}) \ge n_k/N,\ |X_{n_k}| \le \varepsilon n_k\big] \le e^{-n_k/N} \cdot |A|^{\varepsilon n_k},$$

which yields the inequality

$$\frac{1}{N} + \frac{1}{n_k}\log(\delta - \varepsilon) \le \varepsilon \log |A|.$$

But this gives a contradiction if we make $\varepsilon$ sufficiently small since the right hand side tends to zero, but the left hand side to $\frac{1}{N}$. Thus, $\limsup_{n\to\infty} -\frac{1}{n}\mathbb{E}[\log \pi_n(X_n)] = 0$, yielding $h = 0$. □



8. Analyticity of Entropy 

The random walk on $\mathcal{L}$ depends on finitely many parameters which are described by the transition probabilities $p(w_1, w_2)$, $w_1, w_2 \in A^*$ with $|w_1| \le 2$ and $|w_2| \le 3$. That is, each random walk on $\mathcal{L}$ can be defined via a vector $p \in \mathbb{R}^{|B_2 \times B_3|}$, where $B_i := \bigcup_{n=1}^{i} A^n \cup \{o\}$. The support of $p$ is the set of indices in $B_2 \times B_3$ corresponding to non-zero entries of $p$. Fix now any subset $B \subseteq B_2 \times B_3$, which allows at least one well-defined random walk on $A^*$, and consider in the following only vectors $p$ with support $B$, which give rise to a well-defined random walk on $A^*$. We ask whether the entropy mapping $p \mapsto h = h_p$ varies real-analytically. The crucial point will be the following lemma:

Lemma 8.1. The transition probabilities $q(w_1, w_2)$, $w_1, w_2 \in \mathcal{W}_0$, vary real-analytically w.r.t. $p$.

Proof. Observe that analyticity of $q(w_1, w_2)$ follows from analyticity of $\xi(ab)$, $H(ab, c)$, $ab \in A^2$, $c \in A$ and $L(ab, cde)$, $de \in A^2$. Hence, we prove real-analyticity of these generating functions. The function $z \mapsto H(ab, c|z)$ has radius of convergence bigger than 1, which can be easily deduced from Lemma 6.1. Thus, for $\delta > 0$ small enough, we have

$$\infty > H(ab, c|1+\delta) = \sum_{n\ge 1} \mathbb{P}_{ab}[X_n = c,\ \forall m < n : |X_m| \ge 2]\,(1+\delta)^n.$$

The probability $\mathbb{P}_{ab}[X_n = c,\ \forall m < n : |X_m| \ge 2]$ can be rewritten as

$$\sum_{\substack{n_1, \dots, n_d \ge 1: \\ n_1 + \dots + n_d = n}} c(n_1, \dots, n_d)\, p_1^{n_1} \cdot \ldots \cdot p_d^{n_d},$$

where $p_1, \dots, p_d$ correspond to the non-zero entries of the vector $p$. Therefore,

$$H(ab, c|1+\delta) = \sum_{n\ge 1} \sum_{\substack{n_1, \dots, n_d \ge 1: \\ n_1 + \dots + n_d = n}} c(n_1, \dots, n_d)\, (p_1(1+\delta))^{n_1} \cdot \ldots \cdot (p_d(1+\delta))^{n_d}.$$

Hence, $p$ lies in the interior of the domain of convergence of $H(ab, c|1)$ if seen as a multivariate power series in terms of $p$. This yields real-analyticity of $H(ab, c|1)$ in $p$. The same holds for $\xi(ab)$ and $L(ab, cde)$, which is proven completely analogously since $L(ab, cde|z)$ has also radius of convergence bigger than 1, see the proof of Lemma 6.1. □

Now we can prove: 

Theorem 8.2. The entropy $h$ varies real-analytically under all probability measures of constant support.

Proof. The claim follows now easily via the equation $h = \ell \cdot \lambda^{-1} \cdot H(Z)$. By Lemma 8.1, $\nu_0$ (as the solution of a linear system of equations in terms of $q(\cdot, \cdot)$) is real-analytic, so $\lambda$ is analytic. Moreover, by Han and Marcus [10, Theorem 1.1], $H(Z)$ is also real-analytic. Real-analyticity of $\ell$ can be shown completely analogously to the proof of Lemma 8.1 with the help of the formula for $\ell$ given in [6, Theorem 2.4]. This finishes the proof. □